Key Concepts for Parallel Out-of-Core LU Factorization
Abstract
This paper considers key ideas in the design of out-of-core dense LU factorization routines. A left-looking variant of the LU factorization algorithm is shown to require less I/O to disk than the right-looking variant, and is used to develop a parallel, out-of-core implementation. This implementation makes use of a small library of parallel I/O routines, together with ScaLAPACK and PBLAS routines. Results for runs on an Intel Paragon are presented and interpreted using a simple performance model.
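To illustrate the I/O argument in the abstract, the following is a minimal sketch, not the paper's implementation. It assumes a toy cost model in which every matrix column lives on disk and counts as one transfer each time it is loaded or stored. It also omits pivoting and works on single columns rather than panels. The function names `lu_left_looking` and `lu_right_looking` are hypothetical.

```python
def lu_left_looking(a):
    """In-place LU without pivoting, finishing one column at a time.

    Returns (reads, writes): whole-column transfers in a toy model where
    each column is loaded from disk when touched and stored when changed.
    """
    n = len(a)
    reads = writes = 0
    for j in range(n):
        reads += 1  # load column j
        for k in range(j):
            reads += 1  # load factored column k (read only, never rewritten)
            for i in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
        for i in range(j + 1, n):
            a[i][j] /= a[j][j]  # scale the subdiagonal to form L
        writes += 1  # column j is final, so it is stored exactly once
    return reads, writes


def lu_right_looking(a):
    """In-place LU without pivoting, updating the trailing matrix eagerly."""
    n = len(a)
    reads = writes = 0
    for k in range(n):
        reads += 1  # load column k
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
        writes += 1  # store factored column k
        for j in range(k + 1, n):
            reads += 1  # load trailing column j
            for i in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
            writes += 1  # the updated column must be written back every step
    return reads, writes


if __name__ == "__main__":
    m = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
    left = [row[:] for row in m]
    right = [row[:] for row in m]
    print("left-looking  (reads, writes):", lu_left_looking(left))
    print("right-looking (reads, writes):", lu_right_looking(right))
```

In this model both variants read the same number of columns, but the left-looking variant writes each column once (n writes), while the right-looking variant rewrites every trailing column at every step (n(n+1)/2 writes). A real out-of-core code would move panels of many columns and apply partial pivoting, so the counts here only indicate the trend.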
Similar Resources
Parallel LU Factorization on GPU Cluster
This paper describes our progress in developing software for performing parallel LU factorization of a large dense matrix on a GPU cluster. Three approaches, with increasing software complexity, are considered: (i) a naive “thunking” approach that links the existing parallel ScaLAPACK software library with cuBLAS through a software emulation layer; (ii) a more intrusive magmaBLAS implementation...
The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines
This paper describes the design and implementation of three core factorization routines (LU, QR, and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. The full matrix is stored on disk and the factorization routines transfer sub-matrix panels into memory. The ‘l...
THE USE OF SEMI INHERITED LU FACTORIZATION OF MATRICES IN INTERPOLATION OF DATA
Polynomial interpolation in the one-dimensional space R is an important method for approximating functions. The Lagrange and Newton methods are two well-known types of interpolation. In this work, we describe semi inherited interpolation for approximating the values of a function. In this case, the interpolation matrix has a semi inherited LU factorization.
The Design and Implementation of the Parallel Out-of-core ScaLAPACK
This paper describes the design and implementation of three core factorization routines (LU, QR and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. An image of the full matrix is maintained on disk and the factorization routines transfer sub-matrices into mem...
High-Performance Out-of-Core Sparse LU Factorization
We present an out-of-core sparse nonsymmetric LU-factorization algorithm with partial pivoting. We have implemented the algorithm and our experiments show that it can easily factor matrices whose factors are larger than main memory at rates comparable to those of an in-core solver. The algorithm is novel in several respects, including the use of panels that are larger than memory and the use o...
Journal title:
- Parallel Computing
Volume 23
Publication date: 1997